Syntactic annotation of medieval texts: the Syntactic Reference Corpus of Medieval French (SRCMF)

Authors

  • Achim Stein
  • Sophie Prévost
Abstract

This article presents the Syntactic Reference Corpus of Medieval French (SRCMF). The corpus is composed of texts taken from the two major Old French corpora, the Base de Français Médiéval and the Nouveau Corpus d'Amsterdam. This contribution describes some of the core principles of the annotation model, which is based on dependency grammar, as well as the annotation procedure and representation formats.

Similar resources

Building the Syntactic Reference Corpus of Medieval French Using NotaBene RDF Annotation Tool

In this paper, we introduce the NotaBene RDF Annotation Tool free software used to build the Syntactic Reference Corpus of Medieval French. It relies on a dependency-based model to manually annotate Old French texts from the Base de Français Médiéval and the Nouveau Corpus d’Amsterdam. NotaBene uses OWL ontologies to frame the terminology used in the annotation, which is displayed in a tree-lik...

Old French Dependency Parsing: Results of Two Parsers Analysed from a Linguistic Point of View

The treatment of medieval texts is a particular challenge for parsers. I compare how two dependency parsers, one graph-based, the other transition-based, perform on Old French, facing some typical problems of medieval texts: graphical variation, relatively free word order, and syntactic variation of several parameters over a diachronic period of about 300 years. Both parsers were trained and ev...

Much Ado About Nothing? On the Categorial Status of et and ne in Medieval French

When syntactically annotating a text corpus from an earlier stage of some language, one is confronted with the task of determining the categorial status of the elements encountered. This task can become arduous when some element seems to resist a clear-cut categorial assignment. In this case, one sees oneself in principle confronted with the choice between a 'consistent' approach (assignment of...

Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity

In this study we analyze texts used in Russian Unified State Exam on English language. Texts that formed small research corpora were retrieved from 2 resources: official USE database as a reference point, and popular website used by pupils for USE training “Neznaika” (https://neznaika.pro/). The size of two corpora is balanced: USE has 11934 tokens and “Neznaika” - 11918 tokens. We share Biber’...

Prosody in a corpus of French spontaneous speech: perception, annotation and prosody ~ syntax interaction

Our study focuses on the issue of prosodic annotation and of the prosody ~ syntax interface in conversation and is based on a large corpus of conversational speech in French. The results of inter-transcriber agreement tests show that two expert transcribers are consistent in their labeling of prosodic phrasing and the consistency is well above the chance. A qualitative analysis reveals transcri...

Publication date: 2012